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ABSTRACT 

The  sensitivity  of  model  forecasts  to  uncertainties  in  control  variables  is  evaluated  using  the  adjoint 
technique  and  the  ensemble  generated  by  the  reduced-order  four-dimensional  variational  data  assimilation 
(R4DVAR)  algorithm  within  the  framework  of  twin-data  experiments  with  a  quasigeostrophic  model.  To 
simulate  real  applications  where  the  true  state  is  unknown,  the  sensitivities  were  estimated  using  model 
solutions  that  were  obtained  after  assimilating  sparse  observations  extracted  from  the  true  solutions.  The 
numerical  experiments  were  conducted  in  the  linear,  weakly  nonlinear,  and  strongly  nonlinear  (NL)  regimes 
with  special  emphasis  on  the  NL  case  characterized  by  the  instability  of  the  tangent  linear  model.  It  is  shown 
that  the  ensemble-based  R4DVAR  method  provides  better  sensitivity  estimates  in  the  NL  case,  primarily  due 
to  the  better  accuracy  of  the  optimized  solutions.  The  concept  of  sensitivity  in  the  NL  case  is  also  considered 
within  the  statistical  framework.  Using  analytical  arguments  and  numerical  experimentation,  averaging  the 
adjoint  sensitivity  estimates  over  an  ensemble  of  model  trajectories  generated  by  finite  perturbations  of  the 
optimal  control  is  shown  to  provide  an  estimate  similar  to  that  obtained  with  the  adjoint  model  stabilized  by 
enhanced  dissipation.  This  observation  allows  for  evaluation  of  the  sensitivities  of  strongly  nonlinear  optimal 
solutions  by  using  both  the  adjoint  (4DVAR)  and  ensemble  (R4DVAR)  optimization  algorithms. 


1.  Introduction 

Sensitivity  analysis  is  one  of  the  important  features  of 
data  assimilation  algorithms  because  it  allows  the  de¬ 
termination  of  how  changes  in  the  control  variables  of 
a  numerical  model  (e.g.,  initial  conditions)  affect  the 
target  quantities  (TQs)  of  interest  (e.g.,  temperature  in 
a  certain  domain)  at  the  time  of  the  model  forecast.  This 
type  of  analysis  can  be  extended  further  by  combining 
sensitivity  estimates  with  information  about  the  prior 
observational  and  background  errors  to  assess  the  im¬ 
pact  of  new  observations  on  the  a  posteriori  errors  of  the 
TQs  (e.g..  Baker  and  Daley  2000). 

In  the  past  decade,  many  general  circulation  models 
have  been  supplied  with  their  adjoint  code.  This  de¬ 
velopment  has  enabled  a  particular  type  of  sensitivity 
analysis  based  on  the  application  of  the  adjoint  models 
to  obtain  the  derivatives  of  the  TQs  with  respect  to 
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a  variety  of  variables  controlling  the  model  solutions.  To 
cite  a  few  oceanographic  examples  of  the  “adjoint  sen¬ 
sitivity”  (AS)  approach,  Lee  et  al.  (2001)  estimated  the 
sensitivity  of  the  Indonesian  Throughflow  to  remote 
wind  forcing,  Losch  and  Heimbach  (2007)  assessed  the 
AS  of  model  solutions  to  the  bottom  topography,  while 
Moore  et  al.  (2009)  and  Veneziani  et  al.  (2009)  con¬ 
ducted  a  comprehensive  AS  analysis  in  the  California 
Current  region. 

The  major  limitation  of  the  AS  approach  is  the  as¬ 
sumption  that  models  can  be  effectively  linearized  in  the 
vicinity  of  the  optimal  trajectory  to  compute  sensitivities 
with  respect  to  infinitesimal  perturbations  of  the  control 
variables  over  finite  time  intervals.  However,  this  as¬ 
sumption  does  not  work  in  many  applications,  especially 
in  cases  involving  strongly  nonlinear  background  states 
characterized  by  exponential  growth  of  the  infinitesimal 
perturbations.  To  cope  with  this  issue,  Hoteit  et  al.  (2005, 
2010)  introduced  stabilization  of  the  adjoint  model  by 
artificially  enhancing  its  dissipation  to  a  level  that  sup¬ 
pressed  the  unstable  modes. 

Partly  due  to  the  above-mentioned  limitation  of  the 
adjoint  technique,  ensemble  methods  of  data  assimilation 
have  experienced  rapid  development  in  recent  years.  The 
ensemble  approach  is  capable  of  better  handling  the 
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quasi-chaotic  model  behavior  induced  by  nonlinearities, 
does  not  require  development  and  maintenance  of  the 
adjoint  code,  and  appears  to  be  consistent  with  the 
modern  trend  in  computer  technology  of  massive  paral¬ 
lelization.  The  ensemble  sensitivity  (ES)  analysis  (e.g., 
Ancell  and  Hakim  2007;  Torn  and  Hakim  2008)  follows 
the  basic  principle  of  perturbing  a  model  by  the  ensemble 
members  and  analyzing  the  respective  responses  of  the 
TQs.  In  contrast  to  the  AS,  this  approach  is  naturally 
restricted  to  the  subspace  spanned  by  the  ensemble 
members,  but  the  ensemble  members  tend  to  cover  the 
dynamically  significant  manifold  that  accounts  for  the 
major  part  of  the  model  variability  (Farrell  and  Ioannou 
2005). 

Of  certain  interest  are  the  methods  merging  the  en¬ 
semble  approach  with  traditional  variational  assimila¬ 
tion  techniques  (e.g.,  Cao  et  al.  2007;  Liu  et  al.  2008; 
Yaremchuk  et  al.  2009;  Clayton  et  al.  2013).  In  particular, 
Yaremchuk  et  al.  (2009)  demonstrated  that  the  reduced- 
order  four-dimensional  variational  data  assimilation 
(R4DVAR)  algorithm  appears  to  be  advantageous  in 
handling  nonlinear  optimizations  and  is  comparable  in 
computational  cost  with  the  4DVAR  technique.  The 
ES  analysis  has  been  considered  in  detail  by  a  number 
of  authors  (e.g.,  Torn  and  Hakim  2009;  Zack  et  al.  2010) 
in  application  to  the  ensemble  methods  that  keep  the 
ensemble  size  fixed  in  the  process  of  sequential  assimi¬ 
lation.  The  hybrid  R4DVAR  method  considered  in  this 
paper  produces  a  sequence  of  ensembles  spanning  the 
Krylov  subspaces  that  account  for  the  most  dynamically 
significant  error  components  and  can  be  naturally  used  in 
the  ES  analysis  of  the  optimized  solution.  A  similar  se¬ 
quence  (of  search  directions)  is  generated  in  the  process 
of  4DVAR  minimization  that  can  be  used  for  ES  analysis 
of  4DVAR  solutions. 

In  this  study  we  compare  the  accuracy  of  the  adjoint 
4DVAR  and  the  R4DVAR  sensitivity  analysis  tech¬ 
niques  using  a  quasigeostrophic  model  of  intermediate 
complexity.  To  mimic  real  applications,  when  the  true 
state  is  unknown  and  sensitivities  are  computed  using 
linearization  around  the  available  optimal  state,  we  adopt 
the  following  experimentation  procedure.  First,  true 
states  are  generated  from  model  integrations  and  ob¬ 
servations  are  sampled  from  these  true  states  for  the  data 
assimilation  experiments  (using  R4DVAR  and  4DVAR 
techniques)  to  obtain  "optimal”  states.  The  sensitivity 
experiments  are  then  conducted  with  these  optimal  states 
serving  as  the  model  trajectories  and  results  of  the  ex¬ 
periments  are  compared  with  each  other  using  the  sen¬ 
sitivities  computed  from  the  "true”  model  trajectories  as 
benchmarks.  Special  attention  is  given  to  the  experiments 
in  the  strongly  nonlinear  case  characterized  by  instability 
of  the  tangent  linear  (TL)  model. 


The  paper  is  organized  as  follows.  In  the  next  section 
we  briefly  provide  an  overview  of  the  basic  relationships 
of  the  linear  sensitivity  analysis,  discuss  the  sensitivity 
concept  in  strongly  nonlinear  geophysical  flows,  and  show 
that  averaging  sensitivity  estimates  over  realizations  of  a 
turbulent  flow  can  be  approximated  by  a  single  AS  esti¬ 
mate  generated  by  a  stabilized  adjoint  model.  In  section  3, 
the  methodology  of  the  numerical  experiments  with  the 
4DVAR  and  R4DVAR  systems  is  described.  Results  of 
the  experiments,  which  include  consequent  treatment  of 
the  linear,  weakly  nonlinear,  and  strongly  nonlinear  cases, 
are  presented  in  section  4.  The  results  are  summarized  and 
discussed  in  section  5. 

2.  Sensitivity  analysis 

a.  Linear  case 

Let  c  =  {c“,  a  =  1 ,...,«}  denote  the  ^-dimensional 
vector  of  control  variables  (e.g.,  initial  conditions)  of  a 
numerical  model  described  by  the  nonlinear  operator  Af. 
The  operator  A f  maps  the  control  vector  onto  the  model 
trajectory  represented  by  the  /V'-dimensional  vector  X  = 
{Y*,  k  =  1,  . . .  ,  N}  of  all  the  gridded  model  fields  in 
a  given  space-time  domain  of  the  model’s  operation. 
Assuming  differentiability  of  Af,  small  perturbations 
5c  of  the  control  variables  result  in  small  perturbations 
of  the  model  trajectory  5X  that  are  linearly  related  to 
5c  via  the  tangent  linear  model; 

5  Xk=JJLk8ca.  (1) 

a 

Here,  denote  the  elements  of  the  N  X  n  matrix  L 
representing  the  TL  mapping  (linearization  of  Af  in  the 
vicinity  of  c).  To  distinguish  between  the  vectors  from 
the  spaces  of  model  trajectories  and  control  variables, 
their  components  are  enumerated  by  the  Latin  and  Greek 
indices,  respectively. 

Consider  now  a  (scalar)  target  quantity  q  of  interest 
(e.g.,  transport  through  a  certain  section  in  the  model 
domain  averaged  over  a  certain  period  of  time)  repre¬ 
sented  by  a  linear  function  q  =  {Qk,  k  =  1, . . . ,  N}  of  X: 
q  =  lkQkXk. 

Taking  (1)  into  account,  perturbations  Sq  of  the  TQ 
can  be  expressed  in  terms  of  5c“: 

Sq  =  I  Qksxk  =  I  QkLk8ca  =  I  YaSc“ ,  (2) 

k  k,a  a 

where  the  n-dimensional  vector  s  with  components 

sa  =  X  L«Qk  (3) 

k 
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explicitly  provides  the  derivatives  (sensitivity)  of  the  TQ 
with  respect  to  the  (variations  of  the)  control  variables. 
In  practice,  the  computation  of  s  requires  coding  the 
convolution  of  the  transpose  of  the  TL  mapping  LT  with 
the  iV-dimensional  vector  q  describing  the  target  quan¬ 
tity  (adjoint  model  integration  forced  by  Qk).  An  esti¬ 
mate  of  s  obtained  by  the  direct  computation  with  (3)  is 
often  called  the  adjoint  sensitivity  (e.g.,  Ancell  and 
Hakim  2007). 

When  the  adjoint  code  is  not  available,  sensitivity  can 
be  formally  computed  by  the  “brute  force”  method: 
Consider  an  ensemble  of  n  linearly  independent  per¬ 
turbations  of  the  control  vector  (5c, • ,  i  =  1,  . . .  ,  77}  that 
cause  the  respective  perturbations  of  the  TQ,  Sqj  = 
Yjasa8cf,  and  assess  the  impact  of  5c,-  on  the  TQ  by 
computing  the  quantity 

Pa  =  1 8qScf  =  I  s  8c? 8c?  -  I  s  ,  (4) 

i  u 3  P  n 

where  BaP  =  2;5cf 8c f  is  the  n  X  n  matrix  formed  by  the 
sum  of  the  outer  products  of  the  ensemble  perturbations. 
If  we  denote  summation  over  i  by  ()  and  interpret  it  as  the 
“ensemble  average,”  the  quantities  on  the  left-  and  right- 
hand  sides  of  (4)  can  be  identified  as  being  proportional 
to  the  covariances  p  =  ( Sq8c )  and  B  =  (5c5cT).  If  p  and 
B  are  available  from  the  ensemble  of  model  runs,  the 
sensitivity  is  readily  obtained  by  the  simple  relationship 
[cf.  (4)] 

s=B-y  (5) 

which  is  identical  to  the  ensemble  sensitivity  formula 
introduced  by  Ancell  and  Hakim  (2007).  The  only  dif¬ 
ference  is  that,  in  the  purely  statistical  interpretation, 
covariances  are  computed  with  prior  removal  of  their 
respective  means. 

In  real  applications,  the  values  of  n  and  N  are  quite 
large  (106-109)  and  the  relationship  (5)  appears  to  be 
impractical  due  to  the  large  size  of  the  required  en¬ 
semble.  However,  as  has  been  already  shown  (Ancell 
and  Hakim  2007;  Torn  and  Hakim  2008),  meaningful 
results  can  be  obtained  by  inverting  B  in  the  subspace 
spanned  by  the  available  ensemble  members  [i.e.,  re¬ 
placing  B  1  in  (5)  by  its  pseudoinverse].  In  this  case,  the 
accuracy  of  (5)  depends  on  how  well  q  can  be  approxi¬ 
mated  by  a  linear  combination  of  the  ensemble  members 
5c,.  In  many  cases  TQs  are  represented  by  8  functions 
(e.g.,  temperature  at  a  certain  point)  that  have  a  wide 
spectrum.  Ensemble  members,  on  the  other  hand,  are 
usually  associated  with  the  most  dynamically  persis¬ 
tent  modes  and,  therefore,  tend  to  have  a  relatively 
smooth  spatial  variation.  As  a  consequence,  practical 


computations  of  s  via  (5)  usually  result  in  smoother  sen¬ 
sitivity  maps  relative  to  the  maps  obtained  by  directly 
computing  the  components  of  s  using  the  adjoint  code  (3). 

In  most  applications,  however,  the  background  flow  is 
characterized  by  strong  nonlinearities,  which  cause  the 
TL  operator  (and  its  adjoint)  to  have  exponentially 
growing  (unstable)  modes.  As  a  consequence,  the  AS 
analysis  becomes  impractical  for  forecast  periods  ex¬ 
ceeding  the  e-folding  times  of  the  unstable  modes. 

b.  Nonlinear  case 

It  is  noteworthy  that  the  operator  N  does  not  have  the 
instabilities  of  its  tangent  linear  mapping  because  N  is 
constrained  by  conservation  laws  that  prevent  the  orig¬ 
inal  model  solutions  from  uncontrollable  amplification. 
In  that  sense,  the  ES  approach  may  appear  to  be  more 
practical  in  the  case  of  TL  instability  (TLI).  Numerical 
implementation  of  this  approach  has  to  be  made  with 
special  care,  because  the  ensemble  perturbations  can  be 
neither  too  small  (to  avoid  the  effect  of  the  TLIs)  nor  too 
large  relative  to  the  reference  solution.  In  the  latter  case 
the  magnitude  of  the  ensemble  perturbations  should 
certainly  be  smaller  than  the  magnitude  of  the  optimal 
state,  whose  properties  are  being  explored  by  the  ES 
analysis. 

Let  us  consider  the  case  when  the  forecast  time  T  is 
considerably  larger  than  smallest  TLI  time  scale,  a  not 
restrictive  constraint  in  the  majority  of  applications.  In 
that  case  we  can  still  apply  the  AS  algorithm  (3),  al¬ 
though  the  result  will  be  severely  contaminated  by  the 
small-scale  noise  caused  by  TLIs.  One  may  expect  that 
averaging  the  AS  estimates  (3)  over  a  suitable  ensemble 
of  the  background  states  (i.e.,  over  the  ensemble  of  the 
matrices  Lka)  will  decrease  the  TLI-generated  noise  and 
preserve  a  meaningful  sensitivity  signal  that  could  be 
interpreted  as  the  mean  sensitivity  of  the  TQ  given  the 
uncertainty  of  the  optimal  state. 

For  the  sake  of  simplicity,  let  us  assume  that  the  TQ 
does  not  involve  time  averaging  (e.g.,  the  mean  temper¬ 
ature  in  a  certain  region  at  forecast  time  T).  In  this  case 
the  TL  model  controlled  by  the  initial  conditions  c  is 

r)\ 

—  =  M(r)x;  x(0)  =  c,  (6) 

where  M  is  the  dynamical  operator  that  has  been  line¬ 
arized  in  the  vicinity  of  the  optimal  trajectory.  In  this 
case  the  TL  model  state  at  the  forecast  time  T  can  be 
explicitly  expressed  by 

x(T)  =  exp[TM]c,  (7) 

whereas  the  AS  estimate  is  given  by  the  adjoint  of  (7): 
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s  =  (exp[riUI])Tq  =  exp[TMT]q,  (8) 


where  the  overbar  indicates  for  the  time  average 


•T 

o 


M(f)  dt. 


Let  us  further  assume  that  M  is  “advection  dominated”; 
that  is,  |  M|  ~  |u-  V|,  where  u  is  the  optimized  velocity  field 
and  V  is  the  gradient  operator.  Under  this  condition, 
perturbing  M  is  approximately  equivalent  to  adding  per¬ 
turbations  v  to  the  velocity  field,  so  that  a  member  of  the 
ensemble  of  perturbed  AS  estimates  takes  the  form 


s'  =  exp[7XMT-vV)]q.  (9) 


If  the  forecast  time  T  is  much  larger  than  the  decorre¬ 
lation  time  scale  of  the  perturbations  v,  v  can  be  treated 
as  a  Gaussian  field  because,  by  its  definition,  v  is  pro¬ 
portional  to  the  sum  of  a  large  number  of  uncorrelated 
fields  v.  With  an  additional  assumption  of  zero  mean,  one 
can  perform  averaging  over  the  ensemble  to  obtain 

(exp[L(MT  -  v  ■  V)])  =  exp[LM 1  +  VDV] ,  (10) 

where  the  angular  brackets  denote  the  ensemble  aver¬ 
age  and  D  is  the  effective  diffusion  tensor  proportional 
to  the  product  of  the  time-averaged  velocity  covariance 
and  the  forecast  time  (see  the  appendix).  In  other  words, 
one  may  expect  that  the  result  of  averaging  the  AS  esti¬ 
mates  over  an  ensemble  of  random  perturbations  of  the 
optimized  state  will  be  equivalent  to  performing  a  single 
AS  computation  with  the  appropriately  modified  diffu¬ 
sion.  This  observation  provides  significant  computational 
savings  when  computing  both  the  AS  (3)  and  ES  (5)  es¬ 
timates  in  the  strongly  nonlinear  case. 

In  the  following  sections,  we  compare  the  accuracy 
and  efficiency  of  the  AS  and  ES  methods  within  the 
framework  of  twin-data  experiments  with  the  4DVAR 
and  R4DVAR  data  assimilation  systems  in  a  nonlinear 
QG  model  (Yaremchuk  et  al.  2009).  The  R4DVAR 
method  has  many  features  of  the  ensemble  assimilation 
techniques,  as  it  directly  computes  the  cost  function 
gradient  (without  the  adjoint)  for  a  sequence  of  sub¬ 
spaces  spanned  by  the  continuously  updated  ensemble 
members.  Considering  real  applications,  we  adopt  the 
following  experimentation  strategy:  the  AS  and  ES  are 
estimated  with  respect  to  the  optimized  states  obtained 
by,  respectively,  the  4DVAR  and  R4DVAR  methods, 
and  then  these  estimates  are  compared  with  the  sensitivity 
estimates  of  the  “true”  state  (which  has  been  sparsely 
subsampled  to  obtain  the  above-mentioned  optimal 
solutions). 


3.  Methodology 

a.  Experimental  setting 

We  consider  a  QG  model  in  a  square  33  X  33  grid  fl 
with  a  spatial  resolution  of  8  =  15  km: 

d,C  +  /( iA,  Ai//)  +  /3dxi ft  =  eA2i//,  (11) 

At//  —  R^2i//  =  Ci  and  (12) 
i//(dfl)  =  £(dfl)  =  0,  (13) 


where  i//  is  the  stream! unction  in  the  upper  layer,  f3  =  2  X 
1CL11  m_1  s-1  is  the  meridional  gradient  of  the  Coriolis 
parameter,  Rd  =  30  km  is  the  internal  Rossby  radius  of 
deformation,  and  v  is  the  horizontal  diffusion  co¬ 
efficient.  The  details  of  the  model  numerics  and  spinup 
are  described  in  Yaremchuk  et  al.  (2009). 

To  conduct  the  sensitivity  experiments,  the  “true” 
solution  is  extracted  from  the  45-day  model  run  and 
then  the  streamfunction  is  subsampled  at  16  locations 
on  days  15,  30,  and  45  (Fig.  1).  These  data  are  then  as¬ 
similated  using  either  the  4DVAR  or  R4DVAR  method 
by  minimizing  the  following  cost  function  with  respect 
to  the  control  variables  (initial  values  of  the  potential 
vorticity  £): 


3  K  3 

J0=l  l[dk'Ktn)-4>tn?+ws'L  [A2Mtn)?dSi, 

n—1 k=l  n=0 JO 

(14) 


where  Ok  projects  i//(t„)  onto  the  kth  observation  point, 
/C  =  16  is  the  number  of  observation  points  at  time  level 
n ,  and  Ws  =  0.03S4  is  the  smoothing  weight. 

Three  true  solutions  have  been  used  in  the  sensitivity 
experiments,  all  of  them  starting  with  the  initial  distri¬ 
bution  of  i//  shown  in  Fig.  la.  The  first  solution,  purely 
linear,  with  v  =  300  m2  s_1  was  obtained  by  removing  the 
Jacobian  in  (11).  The  corresponding  distribution  of  the 
streamfunction  at  the  end  of  the  integration  is  shown  in 
Fig.  lb.  The  second,  weakly  nonlinear  (WNL)  solution 
had  diffusion  v  =  300m2s_1  comparable  in  magnitude 
with  advection  to  ensure  stability  of  the  TL  model  (Fig.  lc). 
The  third  (nonlinear,  NL)  solution  (Fig.  Id)  was  advec¬ 
tion  dominated  with  a  grid-scale  dissipation  time  of 
82lv  ~  100  days  {v  =  30  m2  s_1). 

b.  4DVAR  and  R4DVAR  optimization  methods 

To  compare  the  accuracy  of  the  AS  and  ES  estimates, 
we  simulated  the  entire  procedures  of  their  computa¬ 
tion.  At  the  first  stage,  an  optimal  solution  is  obtained 
either  by  using  the  adjoint  model  (4DVAR  method)  or 
by  using  an  ensemble  approach  (R4DVAR).  After  that. 
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Fig.  1.  Streamf unction  (a)  at  the  start  and  at  the  end  of  integration  for  the  (b)  linear,  (c)  WNL,  and  (d)  NL  true 
solutions.  Locations  of  the  TQs  used  in  the  sensitivity  experiments  are  shown  in  (c).  Asterisks  in  (b)-(d)  show 
subsampling  (“observation”)  locations  of  the  true  solution. 


these  optimal  solutions  are  employed  to  assess  sen¬ 
sitivities  in  a  similar  way:  using  the  adjoint  code  (3) 
and  using  the  output  of  the  R4DVAR  optimization 
algorithm  (5). 

The  4DVAR  optimization  computed  the  cost  function 
gradient  with  respect  to  the  vector  of  control  variables 
c  =  £(f0)  using  the  adjoint  code.  The  quasi-Newtoninan 
minimization  algorithm  of  Gilbert  and  Lemarechal  (1989) 
was  used  for  descent. 

The  R4DVAR  optimization  method  minimizes  the 
cost  function  (14)  in  a  sequence  of  low-dimensional 
Krylov  subspaces  K!-\i  =  1,  . . . n„)  of  the  control  space. 
Each  minimization  required  computation  of  the  m  =  15 
perturbed  model  solutions  contributing  to  the  global 
perturbation  ensemble  {5c,},  which  was  later  used  for 


ES  analysis.  Apart  from  computing  projections  of  the 
cost  function  gradient  and  the  Hessian  matrix  on  K!'1 
(Yaremchuk  et  al.  2009),  the  ith  minimization  process 
additionally  calculated  perturbations  of  the  TQs  Sg,  for 
the  ES  estimation.  After  finding  the  suboptimal  control 
c„  the  minimization  problem  was  updated  by 

■/i+i(c)s/«(c  +  c«)-> 

ceVi+i 

and  the  updated  ensemble  (whose  members  span  /C™  , ) 
was  computed  by  taking  m  leading  modes  of  Ar(c,)  and 
orthogonalizing  them  to  /C™  ;  and  K"1 .  This  strategy 
demonstrated  the  fastest  convergence  of  the  R4DVAR 
algorithm,  which  usually  required  nu  ~  30-40  ensemble 
updates  or  mnu  ~  450-600  perturbed  model  runs. 
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Error  in  the  approximation  of  the  true  solutions  i //lrue 
was  estimated  in  terms  of  the  RMS  difference  between 
the  optimized  i//opt  and  true  streamfunctions: 

1/2 

(15) 


With  the  exception  of  the  linear  case,  where  the 
4DVAR-  and  R4DVAR-optimized  solutions  were  nearly 
identical  with  an  RMS  discrepancy  of  e  =  0.19,  in  both 
nonlinear  cases  the  R4DVAR  optimizations  were  better 
with  e  =  0.21  versus  e  =  0.32  in  the  WNL  case  and  e  = 
0.28  versus  e  =  0.44  in  the  NL  case.  Technically,  the 
larger  4DVAR  errors  were  caused  by  the  eventual  loss 
of  the  descent  direction,  which  typically  occurred  after 
100-150  iterations.  On  the  other  hand,  a  lower  value  of  e 
was  achieved  by  the  R4DVAR  at  the  expense  of  larger 
computational  cost:  typically,  the  algorithm  converged 
after  30-40  ensemble  updates,  which  are  equivalent  to 
250-300  iterations  of  the  adjoint  code  in  terms  of  the 
computer  time  (CPU). 

c.  AS  and  ES  experiments 

The  major  TQ  used  in  the  sensitivity  experiments  was 
the  1-day  mean  transport  across  the  section  shown  in 
Fig.  lc: 


O/'opt  ~ '/'true)  dtda 


</£u  edtdCl 


T-U 


[i/j(xvt)  -  i/j(x2,t)\dt,  (16) 


where  t\  =  44  days  and  X]i2  denote  the  locations  of  the 
end  points  of  the  section.  To  explore  the  sensitivity  of 
the  results  to  variations  in  TQ,  we  also  used  similar  1-day 
means  of  the  streamfunction  (or  sea  surface  height, 
SSH)  at  the  point  x3  and  the  horizontal  mean  stream- 
function  (SSFI)  over  the  area  S  surrounding  the  point  x4, 
both  shown  in  Fig.  lc: 


q~  =  i//(x,,t);  q3  =  S  1 


*A(x4)  dS . 


(17) 


Overbars  here  stand  for  the  time  average  in  (16).  Com¬ 
ponents  Qk  of  the  corresponding  operators  were  obtained 
by  taking  the  derivatives  of  (16)— (17)  with  respect  to  the 
gridpoint  values  of  ijj. 

The  diffusion  coefficient  in  the  WNL  case  was  ob¬ 
tained  after  a  series  of  experiments  with  the  response  of 
the  adjoint  model  to  the  forcing  by  the  operators  of  the 
TQs  q1  and  q-.  In  the  NL  case,  the  adjoint  model  ex¬ 
hibited  exponential  growth  of  the  solutions  with  an 
approximate  e-folding  time  t  of  5-7  days  (Fig.  2,  top 


t,  days 


Fig.  2.  Growth  of  the  adjoint  relative  vorticity  a>  associated  with 
the  TQs  q1  (shown  in  black)  and  q2  (gray).  Bottom  lines  correspond 
to  the  adjoint  model  stabilized  with  the  additional  diffusion. 


curves).  To  stabilize  the  adjoint  solution,  an  extra  dif¬ 
fusion  term  with  v  =  260  m2  s  1  was  added  to  diffuse  the 
adjoint  variables. 

The  ES  experiments  utilized  the  sequence  of  ensem¬ 
bles  generated  by  the  R4DVAR  algorithm.  The  ensemble 
members  were  used  to  perturb  the  optimal  solutions  and 
compute  the  covariances  p,  B  in  (5).  Compared  to  the  AS, 
the  ES  experiments  were  computationally  more  intensive 
due  to  the  necessity  of  computing  p,  which  required  a 
number  of  model  runs  equal  to  the  size  of  the  averaging 
ensemble  mnu.  In  practice,  however,  these  extra  CPU 
requirements  could  be  eliminated  by  computing  pertur¬ 
bations  Sqk  in  the  course  of  the  ensemble  model  runs 
performed  during  the  R4DVAR  optimization. 

4.  Results 

a.  Linear  case 

Figure  3  shows  the  AS  and  ES  estimates  in  the  linear 
case  with  (3  =  4  X  10_11m_1s_1.  In  such  a  simplified 
setting,  solutions  of  the  optimization  problem  (14)  by 
the  4DVAR  and  R4DVAR  methods  are  quite  similar, 
with  the  approximation  errors  e  =  0.19. 

The  true  sensitivity  map  obtained  by  the  adjoint 
method  (Fig.  3a)  appears  to  be  accurately  approximated 
(p  =  0.99)  by  the  ES  map  computed  by  averaging  over 
the  900  ensemble  members  (nu  =  60;  Fig.  3b).  It  is  un¬ 
likely  that  such  a  level  of  accuracy  can  be  achieved  in 
real  applications  because  it  may  require  an  ensemble 
size  comparable  with  the  number  of  the  control  variables. 
A  more  realistic  estimate  with  only  60  ensemble  mem¬ 
bers  ( nu  =  4)  still  produces  an  acceptable  approximation 
(Fig.  3c)  to  Fig.  3a,  indicating  that  the  TQ  operator  can  be 
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FIG.  3.  (a)-(c)  Sensitivities  in  the  linear  case,  (a)  Correlation  coefficients  p  with  the  AS  distribution  computed  by 
averaging  over  the  squared  subdomain  are  shown  for  the  ES  estimates  obtained  with  (b)  60  and  (c)  4  ensemble 
updates,  (d)  The  dependence  of  1  —  p  on  the  number  of  ensemble  updates  «„  used  in  the  R4DVAR  ES  estimates. 


described  by  these  members  with  a  reasonable  degree  of 
accuracy.  The  corresponding  sensitivity  distribution  (Fig. 
3c)  appears  to  be  somewhat  smoothed  (cf.  Fig.  3a)  due  to 
the  lack  of  high-frequency  harmonics  in  the  leading  en¬ 
semble  members.  Similar  behavior  was  also  observed  by 
Ancell  and  Flakim  (2007)  in  ES  experiments  with  a 
90-member  ensemble  Kalman  filter  applied  to  a  regional 
Weather  Research  and  Forecasting  Model  (WRF). 

The  accuracy  of  the  approximation  of  the  AS  map  by 
the  ES  estimates  is  shown  in  Fig.  3d  as  a  function  of  the 
number  of  ensemble  members  used.  In  the  linear  case 
considered,  only  nu  =  4  ensemble  updates  (60  members) 
are  sufficient  to  achieve  a  correlation  of  p  =  0.81  with  the 
true  sensitivity  map  shown  in  Fig.  3a. 


From  a  computational  point  of  view,  performing  nu  =  4 
ensemble  updates  (mnu  =  60  perturbed  model  runs)  of 
the  R4DVAR  algorithm  is  approximately  equivalent  to 
28  iterations  of  the  4DVAR  algorithm,  which  actually 
converged  in  240  iterations,  whereas  the  R4DVAR  ap¬ 
proach  required  60  ensemble  updates  (1.6  times  more 
CPU  time)  to  obtain  the  pattern  in  Fig.  3b. 

b.  Nonlinear  case 

The  results  of  assimilation  in  the  nonlinear  case  (true 
solutions  shown  in  Figs.  lc,d)  were  quite  different.  Similar 
to  the  linear  case,  the  R4DVAR  algorithm  converged  after 
60-70  ensemble  updates,  whereas  the  4DVAR  method 
exhibited  a  loss  of  search  direction  after  80-120  iterations. 
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As  a  result,  the  respective  approximation  errors  e  of  the 
true  solution  were  significantly  larger  in  the  4DVAR 
case  than  in  the  R4DVAR  case  ( e  =  0.32-0.44  versus 
0.21-0.28).  On  the  other  hand,  the  CPU  time  required 
for  4DVAR  optimization  was  2.5-4  times  less  than  for 
R4DVAR.  It  should  be  noted,  however,  that  after  8-10 
ensemble  updates,  the  R4DVAR  approximation  error 
was  already  compatible  with  the  one  obtained  by  the 
4DVAR  method. 

In  the  sensitivity  experiments,  we  considered  two  non¬ 
linear  cases:  with  the  stable  and  unstable  adjoint  models. 
The  results  of  the  respective  AS  and  ES  runs  allowed  us  to 
compare  the  pros  and  cons  of  both  techniques  and  in¬ 
vestigate  the  sensitivity  in  the  NL  case  from  a  conceptual 
point  of  view. 

1)  Stable  TL  model 

Figure  4  compares  the  <71  AS  and  ES  estimates  com¬ 
puted  using  linearizations  with  respect  to  the  corre¬ 
sponding  optimal  solutions  against  the  AS  estimate 
computed  in  the  vicinity  of  the  true  solution  (Fig.  lc). 
Similar  to  the  linear  case,  the  ES  map  of  the  true  solution 
(not  shown)  was  almost  identical  (p  =  0.99)  to  the  AS 
map  in  Fig.  4a.  The  major  differences  between  the  sen¬ 
sitivity  maps  in  panels  Figs.  4b-d  and  Fig.  4a  are  due  to 
the  differences  in  the  optimal  states  obtained  by  the 
4DVAR  and  R4DVAR  methods. 

The  approximation  error  of  the  true  sensitivity  by  the 
ES  method  as  a  function  of  the  number  of  ensemble 
updates  is  shown  in  Fig.  3d  by  the  gray  line.  Compared  to 
the  linear  case,  the  WNL  ES  estimates  required  twice  as 
many  ensemble  updates  ( n„  =  9)  to  achieve  a  reasonable 
correlation  (p  =  0.8)  with  the  true  sensitivity.  At  larger  n, 
however,  the  difference  in  p  between  the  linear  and  weakly 
nonlinear  cases  becomes  virtually  indistinguishable. 

2)  Unstable  TL  model 

The  case  of  the  unstable  TL  model  is  the  most  in¬ 
teresting  from  the  conceptual  point  of  view.  In  this  case, 
at  any  given  time  the  TL  operator  has  a  number  of  ex¬ 
ponentially  growing  modes  whose  structure  is  charac¬ 
terized  by  small-scale  spatial  variability.  Numerically, 
this  property  of  the  TL  model  causes  a  considerable 
difference  between  the  forecasts  generated  by  the  un¬ 
perturbed  and  perturbed  model  solutions,  which  shows 
no  linear  dependence  on  the  perturbations’  magnitude. 
Although  this  difference  is  bounded  by  the  conservation 
laws  governing  the  dynamical  system,  normalization  of 
the  model’s  response  by  the  magnitude  of  the  pertur¬ 
bation  produces  unacceptably  large  sensitivity  maps 
dominated  by  the  grid-scale  patterns  (Fig.  5a).  It  is  in¬ 
structive  that  patterns  of  this  type  are  produced  by  both 
the  AS  and  ES  algorithms,  because  the  only  difference 


between  them  is  in  the  method  of  computing  the  de¬ 
rivatives  with  respect  to  the  perturbations.  The  AS 
method  employs  analytical  differentiation  to  obtain  the 
code  for  multiplication  by  Lka  in  (1),  whereas  the  ES 
algorithm  does  this  numerically  by  computing  the  re¬ 
sponse  correlation  vector  p  (4)  and  (pseudo-)  inverting 
the  perturbation  matrix  B. 

As  a  consequence,  one  comes  to  the  conclusion  that 
model  solutions  at  integration  times  exceeding  the  TLI 
time  scale  (Fig.  2)  are  nondifferentiable  numerically. 
This  phenomenon  degrades  the  performance  of  the 
4DVAR  optimization  algorithms  in  strongly  nonlinear 
regimes  and  makes  the  linear  sensitivity  estimates 
worthless.  The  problem  could  be  resolved  by  changing 
the  concept  of  the  sensitivity  analysis  in  the  NL  regimes. 
Given  the  intrinsic  uncertainty  of  an  optimized  solution, 
it  is  more  reasonable  to  consider  a  “composite”  AS  map, 
obtained  by  averaging  over  the  maps  corresponding 
to  linearizations  with  respect  to  the  ensemble  of  solu¬ 
tions  whose  spread  around  the  optimal  one  is  consistent 
with  the  a  posteriori  knowledge  of  the  above-mentioned 
uncertainty. 

Figure  5b  shows  the  result  of  averaging  the  AS  esti¬ 
mates  of  the  true  solution  over  the  ensemble  of  200 
model  runs  generated  by  adding  white  noise  pertur¬ 
bations  to  the  true  initial  distribution  of  the  potential 
vorticity  (the  corresponding  streamfunction  is  shown  in 
Fig.  la).  To  suppress  eventual  instabilities  caused  by 
violation  of  the  Courant-Friedrichs-Lewy  (CFL)  con¬ 
dition,  the  perturbations  were  filtered  with  a  cutoff 
length  scale  of  two  grid  steps.  The  magnitude  f  of  the 
perturbations  (relative  to  the  RMS  magnitude  of  the  true 
solution)  was  equal  to  the  typical  value  of  the  approxi¬ 
mation  error  e  ~  0.3.  It  is  remarkable  that  the  sensitivity 
map  in  Fig.  5b  resembles  the  NL  map  in  Fig.  4a,  although 
the  magnitude  of  the  sensitivity  maxima  in  Fig.  5b  is 
30%-50%  larger  than  in  Fig.  4a.  In  the  following,  we  will 
refer  to  the  ensemble-averaged  sensitivity  (EAS)  map  in 
Fig.  5b  as  the  true  sensitivity  in  the  NL  case. 

Obtaining  EAS  maps  of  the  type  shown  in  Fig.  5b  is 
unfeasible  in  practice  because  it  requires  many  runs  of 
the  model  and  its  adjoint.  Therefore,  numerically  effi¬ 
cient  approximations  to  such  maps  should  be  of  interest. 
A  straightforward  method  of  smoothing  the  AS  map  in 
Fig.  5a  appears  to  be  inefficient.  The  largest  correlation 
p  =  0.34  with  Fig.  5c  is  obtained  with  a  smoothing  scale 
/  =  5.45  and  the  respective  sensitivity  distribution  (Fig.  5c) 
barely  resembles  Fig.  5b.  An  alternative  approach  is  to 
compute  the  sensitivity  using  the  adjoint  model  with  en¬ 
hanced  dissipation  (Hoteit  et  al.  2005).  The  resulting  sta¬ 
bilized  adjoint  sensitivity  (SAS)  map  (Fig.  5d)  correlates 
well  with  Fig.  5b  (p  =  0.80)  but  has  a  somewhat  smaller 
magnitude.  Similarity  between  the  maps  in  Figs.  5b, d 
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FIG.  4.  Sensitivities  in  the  WNL  case:  (a)  true  sensitivity,  (b)  AS  estimate,  and  (c),(d)  ES  estimates.  Estimates  in 
(b)-(d)  were  computed  using  model  linearizations  in  the  vicinity  of  the  optimal  states  obtained  by  the  4DVAR 
method  in  (b)  and  the  R4DVAR  method  with  60  and  20  ensemble  updates  in  (c)  and  (d),  respectively. 


supports  the  considerations  in  section  2b  and  justifies 
the  stabilization  method  of  Hoteit  et  al.  (2005). 

In  real  applications  the  true  state  is  never  known,  so 
we  consider  further  below  the  AS  (ES)  sensitivities  with 
respect  to  the  optimal  states,  obtained  in  the  course  of 
the  4DVAR  (R4DVAR)  optimizations.  The  R4DVAR 
analog  of  the  SAS  technique  requires  integration  of  the 
R4DVAR  perturbation  model  with  the  enhanced  dissi¬ 
pation  to  obtain  a  “stabilized  version”  of  the  correlation 
vector  p,  which  is  then  used  for  computing  an  approxi¬ 
mation  to  the  EAS  map  via  (5).  In  applications,  such 
integration  has  to  be  performed  in  parallel  with  the 


R4DVAR  optimization  process.  Figure  6b  demonstrates 
the  stabilized  ES  (SES)  map  obtained  using  such  an  ap¬ 
proach.  The  sensitivity  pattern  is  well  correlated  (p  = 
0.75)  with  the  EAS  map  in  Fig.  6a,  despite  utilization  of 
the  R4DVAR-optimized  solution  ( e  =  0.28)  as  a  refer¬ 
ence  state  for  the  perturbations.  The  SES  map  in  Fig.  6b 
approximates  the  true  sensitivity  map  in  Fig.  6a  much 
better  than  does  the  SAS  maps  generated  using  the 
4DVAR-  and  stabilized  4DVAR-optimized  solutions  for 
linearization  of  the  adjoint  model,  which  demonstrate 
lower  correlations  (p  =  0.50  and  p  =  0.54,  respectively) 
with  the  true  sensitivity  map  in  Fig.  6a. 
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Fig.  5.  Sensitivity  estimates  in  the  NL  case  computed  by  linearization  in  the  vicinity  of  the  true  state:  (a)  the  AS 
estimate,  (b)  estimate  obtained  by  averaging  over  the  ensemble  of  AS  estimates  computed  by  linearization  in  the 
vicinity  of  the  perturbed  true  states,  (c)  the  smoothed  AS  estimate  (a),  and  (d)  the  AS  estimate  with  diffusive 
stabilization  of  the  adjoint  model. 


To  compare  the  skill  of  the  above-mentioned  sensi¬ 
tivity  computation  techniques  (EAS,  SES,  and  SAS), 
a  series  of  numerical  experiments  has  been  conducted 
with  the  three  different  TQs  shown  in  Fig.  lc  and  de¬ 
scribed  by  (16)  and  (17).  The  results  of  the  experiments 
are  presented  in  Table  1.  It  is  evident  that  both  the  SAS 
and  SES  methods  tend  to  underestimate  the  sensitivity 
magnitude  of  the  (computationally  much  more  expen¬ 
sive)  EAS  method,  but,  nevertheless,  provide  a  reason¬ 
able  approximation  to  the  overall  pattern  (cf.  the  two 
bottom  numbers  in  the  rightmost  column  with  the  top 


ones).  The  R4DVAR  method  appears  to  have  slightly 
better  skill  than  4DVAR,  but  this  is  mostly  due  to  the 
better  approximation  of  the  true  solution  provided  by 
R4DVAR  (e  =  0.28  versus  e  =  0.44).  The  difference 
between  the  4DVAR  and  stabilized  4DVAR  (S4DVAR) 
assimilation  results  was  virtually  indistinguishable,  both 
in  terms  of  the  errors  e  =  0.44-0.46  and  the  accuracy  of 
the  sensitivity  maps. 

We  also  explored  the  dependence  of  the  EAS  esti¬ 
mates  on  the  amplitude  of  the  ensemble  perturbations 
(Fig.  7).  The  U-shaped  curve  is  caused  by  two  types  of 
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FIG.  6.  Sensitivity  estimates  in  the  NL  case  computed  by  linearizations  in  the  vicinity  of  the  optimized  states 
obtained  by  (b)  R4DVAR  (SES  method),  (c)  4DVAR  (SAS  method),  and  (d)  the  stabilized  4DVAR  (SAS  method), 
(a)  For  comparison  the  true  sensitivity  is  shown. 


instabilities:  on  the  one  hand,  perturbations  cannot  be  too 
small  due  to  the  presence  of  growing  modes  in  the  TL 
model,  and  on  the  other  hand,  they  are  limited  from  above 
by  the  CFL  condition.  In  our  case,  the  true  NL  solution 
was  characterized  by  the  maximum  CFL  value  of  0.6.  As 
a  consequence,  imposing  perturbations  with  the  same 
amplitude  as  the  solution  caused  a  violation  of  the  CFL 
stability  criterion.  In  addition,  imposing  perturbations  of 
such  a  large  magnitude  is  questionable  in  itself,  as  the 
reference  model  dynamics  can  be  lost  in  the  background 
of  the  model’s  interactions  with  the  perturbations. 

In  a  sense,  the  magnitude  of  the  perturbations  in  the 
EAS  method  is  clinched  between  the  Scylla  of  the  TL 


instability  and  the  Charybdis  imposed  by  the  above- 
mentioned  natural  limitations.  In  the  reported  exper¬ 
iments,  we  used  the  value  of  £  =  0.3  for  computation  of 
the  true  and  R4DVAR-referenced  sensitivities.  It  is  re¬ 
markable,  however,  that  the  largest  values  of  p  for  the 
4DVAR-  and  S4DVAR-referenced  EAS  estimates  was 
achieved  at  f  =  0.45,  a  value  consistent  with  the  approx¬ 
imation  errors  of  the  respective  optimized  solutions. 

c.  Computational  cost 

ES  estimation  within  the  R4DVAR  algorithm  in  the 
linear  and  WNL  regimes  requires  just  a  small  fraction  of 
the  CPU  time  required  for  data  optimization  because 
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Table  1.  RMS  magnitude  (numbers  to  the  left  within  the  cells) 
and  correlation  coefficient  p  (numbers  to  the  right  within  the  cell) 
of  the  EAS,  SES,  and  SAS  estimates  (respectively,  the  top,  middle, 
and  bottom  numbers  within  the  cells)  for  the  three  TQs.  Compu¬ 
tations  were  performed  using  the  true  reference  state  (rows  1-3) 
and  the  reference  states  obtained  with  the  stabilized  4DVAR 
(S4DVAR;  rows  4-6)  and  R4DVAR  (rows  7-9)  methods.  The 
magnitude  of  the  sensitivity  estimates  is  normalized  by  the  re¬ 
spective  EAS  values. 


Reference 

state 

<7i 

<?2 

<73 

Mean 

True 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.01 

0.90 

0.56 

0.82 

0.64 

0.85 

0.74 

0.86 

1.42 

0.80 

0.83 

0.91 

0.66 

0.81 

0.97 

0.84 

S4DVAR 

0.65 

0.74 

0.63 

0.46 

0.50 

0.79 

0.59 

0.66 

0.88 

0.63 

0.66 

0.37 

0.49 

0.69 

0.68 

0.56 

0.81 

0.54 

0.52 

0.52 

0.43 

0.71 

0.59 

0.60 

R4DVAR 

0.76 

0.81 

0.64 

0.67 

0.85 

0.91 

0.75 

0.80 

0.78 

0.75 

0.55 

0.65 

0.60 

0.76 

0.64 

0.72 

0.78 

0.75 

0.58 

0.69 

0.57 

0.78 

0.64 

0.74 

the  perturbations  of  the  TQs  can  be  effectively  com¬ 
puted  during  the  perturbed  model  runs  in  the  course  of 
minimization  of  the  cost  function.  Upon  completion,  the 
correlation  vector  p  is  easily  computed  together  with  the 
pseudoinversion  of  the  perturbation  matrix.  With  a  rel¬ 
atively  small  (100-500)  number  of  perturbations,  these 
operations  are  not  costly  and  produce  ES  estimates 
comparable  in  accuracy  to  the  AS  ones  (Fig.  4). 

In  the  NL  case,  the  concept  of  sensitivity  has  to  be 
transformed  from  the  deterministic  to  the  statistical 
framework.  As  a  consequence,  direct  numerical  sensi¬ 
tivity  estimates  appear  to  be  computationally  unfeasible 
and  require  efficient  approximation  methods.  Extension 
to  the  R4DVAR  case  of  the  adjoint  stabilization  tech¬ 
nique  proposed  by  Hoteit  et  al.  (2005)  requires  addi¬ 
tional  integration  of  the  stabilized  perturbation  model, 
which  can  be  executed  in  parallel  with  the  ensemble  runs 
of  the  R4DVAR  optimization  algorithm.  The  twofold 
increase  of  the  computational  requirements  can  be  partly 
justified  by  the  improved  performance  of  the  R4DVAR 
algorithm  in  the  NL  regimes. 

5.  Summary  and  discussion 

In  many  geophysical  problems,  application  of  the 
4DVAR  data  assimilation  technique  encounters  diffi¬ 
culties  associated  with  the  emergence  of  exponentially 
growing  modes  of  the  TL  operator  in  flow  regimes 
characterized  by  nonlinear  instabilities.  A  straightforward 
way  to  deal  with  the  problem  is  to  either  increase  the 
diffusion  coefficient  in  the  adjoint  of  the  background  nu¬ 
merical  model  (Hoteit  et  al.  2005;  Zhang  et  al.  2011)  or  to 
reduce  the  duration  of  the  data  assimilation  window  to  the 


Fig.  7.  Sensitivity  magnitude  estimated  by  the  EAS  method  as 
a  function  of  the  relative  perturbation  amplitude  f.  Dashed  line 
shows  |S|  computed  by  the  SAS  technique. 


smallest  e-folding  time  scales  of  the  instabilities  (Xu  and 
Daley  2000).  The  latter  approach  is  acceptable  in  op¬ 
erational  meteorology,  which  enjoys  dense  daily  cov¬ 
erage  by  observations,  but  is  rarely  used  in  oceanography 
(Ngodock  et  al.  2009)  where  observations  within  the 
ocean  interior  are  sparse. 

Within  the  context  of  a  sensitivity  analysis,  NL  systems 
require  a  conceptual  transition  from  a  deterministic  (ad¬ 
joint  based)  to  a  probabilistic  (ensemble  based)  method¬ 
ology.  In  this  study  we  have  shown  that  AS  estimates 
stabilized  by  enhanced  dissipation  are  capable  of  pro¬ 
viding  a  good  approximation  to  the  probabilistic  sen¬ 
sitivities  in  the  NL  regimes. 

Special  attention  has  been  paid  to  the  comparison  of 
the  sensitivity  estimates  obtained  with  the  ensemble- 
based  R4DVAR  data  assimilation  technique  and  with 
the  4DVAR  method  in  both  linear  and  nonlinear  re¬ 
gimes.  In  the  linear  and  WNL  regimes,  the  R4DVAR  ES 
estimates  are  similar  in  accuracy  to  the  AS  ones  and  do 
not  require  additional  computations  if  the  perturbations 
of  the  TQs  are  computed  in  the  process  of  R4DVAR 
optimization.  In  the  NL  case,  the  ensemble  analog  of  the 
stabilized  AS  technique  is  developed.  The  proposed  sta¬ 
bilized  ES  method  is  similar  in  accuracy  to  the  SAS  ap¬ 
proach,  but  requires  a  twofold  increase  in  computer  time 
due  to  the  necessity  of  augmenting  the  R4DVAR  opti¬ 
mization  process  with  integration  of  the  stabilized  per¬ 
turbation  model. 

The  R4DVAR  ES  technique  is  similar  to  the  one 
considered  by  Ancell  and  Hakim  (2007)  with  the  differ¬ 
ence  that  the  ensemble  averages  are  not  removed  and  the 
correlation  vector  p  is  always  multiplied  by  the  pseudo¬ 
inverse  of  the  ensemble  matrix.  A  number  of  computa¬ 
tions  were  performed  with  the  removal  of  the  ensemble 
averages  and/or  replacement  of  B 1  in  (5)  by  the  diagonal 
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matrix  of  the  inverse  ensemble  variances  (Torn  and 
Hakim  2008),  but  these  experiments  consistently  pro¬ 
duced  a  decrease  in  the  accuracy  of  the  ES  estimates.  To 
some  extent,  the  result  can  be  explained  by  the  fact  that 
the  considered  TQs  have  a  noticeable  projection  on  the 
ensemble-mean  state  whose  removal  increases  the  error 
of  the  approximation  of  the  true  sensitivity. 

Theoretical  arguments  (section  2b)  supported  by  nu¬ 
merical  experiments  have  shown  the  ability  of  the  SES 
and  SAS  methods  to  effectively  approximate  the  EAS 
sensitivity  maps  in  the  NL  regime.  Confirming  this  result 
with  a  real  OGCM  requires  particular  care  in  the  defi¬ 
nition  of  the  amplitude  of  the  ensemble  perturbations 
(section  4b),  which  contain  multivariate  state  vectors 
and  may  be  affected  by  the  higher-order  nonlinearities 
present  in  the  model. 

Extending  the  R4DVAR  ES  technique  to  optimi¬ 
zation  (with  respect  to  TQs)  of  the  observational  net¬ 
works  (observation  targeting)  appears  to  be  a  feasible 
task,  because  products  of  the  ensemble  members  by 
the  Hessian  of  the  assimilation  problem  are  among  the 
outputs  of  the  R4DVAR  optimization  process.  A  similar 
approach  to  observation  targeting  can  be  undertaken  with 
the  4DVAR  method,  which  generates  a  sequence  of  de¬ 
scent  directions  in  the  process  of  minimizing  the  cost 
function. 

The  linearization  technique  underlying  the  4DVAR 
and  AS  analysis  imposes  natural  limitations  on  the 
performance  of  the  adjoint  methods  in  NL  regimes.  This 
opens  further  prospects  for  the  development  of  the  en¬ 
semble  approach  in  sensitivity  analysis  and  optimization 
of  observational  networks. 
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APPENDIX 

Averaging  the  Advective  Propagator 

We  denote  the  covariance  of  the  time-averaged  per¬ 
turbation  velocities  by 

Y  =  (fvT) 


and  expand  the  ensemble  mean  of  the  advective  prop¬ 
agator  in  the  Taylor  series: 


<e*p[-,-V])=i«=LTA  (A1) 

k=0  K- 

Assuming  that  v  is  nondivergent,  the  even-order  terms 
of  the  expansion  can  be  rearranged  in  the  form 

((- v  ■  V)2m)  _  [V(rrT)Vf  _  1 
(2/7;)!  2  mm\  ml 


vvv 

2 


(A2) 


whereas  odd-order  terms  vanish  due  to  Gaussianity. 
Substitution  of  the  rhs  from  (A2)  into  (Al)  yields 


00  i 

(exp[— v  ■  V])  =  X  -y 

m=0m] 


'VVV' 

m 

i  _ 

2 

=  exp 

-VVV 

2 

(A3) 


Comparing  (A3)  with  (10)  provides  the  relationship  be¬ 
tween  the  effective  diffusion  tensor  D  and  the  covariance 
tensor  V  of  the  time-averaged  perturbation  velocities: 

D  =  |v.  (A4) 

The  operator  exp(—  Tv  ■  V)  is  a  propagator  by  a  ran¬ 
dom  velocity  field,  a  process  similar  to  classic  Brownian 
motion.  Therefore,  one  may  expect  that,  after  averaging 
over  iJ,  only  the  spatial  scales  exceeding  T \/trV  will  re¬ 
main;  that  is,  this  operator  effectively  removes  the  small- 
scale  components  of  the  field  it  acts  upon.  The  assumption 
that  the  largest  principal  axis  of  V  tends  to  be  aligned  with 
the  optimal  (analyzed)  current  is  one  of  the  underlying 
principles  of  flow-dependent  background  error  modeling 
in  data  assimilation  (e.g..  Purser  et  al.  2003;  Xu  2005; 
Mirouse  and  Weaver  2010),  which  has  gained  consider¬ 
able  attention  in  recent  years  (Yaremchuk  et  al.  2013). 
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